Skip to content
This repository was archived by the owner on Jun 10, 2020. It is now read-only.

Define special methods for ndarray and add more extensive tests. #10

Merged
merged 17 commits into from
Mar 13, 2018

Conversation

shoyer
Copy link
Member

@shoyer shoyer commented Feb 20, 2018

No description provided.

array_scalar = np.array(1)
int(array_scalar)
float(array_scalar)
complex(array_scalar)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mypy current errors on this line: tests/test_simple.py:41: error: No overload variant of "complex" matches argument types [numpy.ndarray]

Is this a mypy bug or am I doing something wrong?

Copy link
Contributor

@emmatyping emmatyping Feb 20, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, this is a bug it would appear. Since ndarray subscribes to the SupportsComplex protocol (sorry, I missed that initially), I see no reason this shouldn't work.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reported at python/mypy#4592

@shoyer shoyer changed the title Define a special methods for ndarray and add more extensive tests. Define special methods for ndarray and add more extensive tests. Feb 20, 2018
@shoyer
Copy link
Member Author

shoyer commented Feb 28, 2018

@eric-wieser Want to take a look?

@shoyer
Copy link
Member Author

shoyer commented Mar 6, 2018

@alanhdu any thoughts?

def __float__(self) -> float: ...
def __complex__(self) -> complex: ...
def __oct__(self) -> str: ...
def __hex__(self) -> str: ...
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These should be conditional on the python version. I think mypy supports that?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it supports limited version checking with sys.version_info.

len(array)
str(array)
array + 1
-array
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you want to test all the binary operators here too?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that's probably a good idea.

def __getattr__(self, name) -> Any: ...


def array(
object: object,
dtype: dtype = ...,
dtype: _dtype_class = ...,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_dtype_class is "things which can be converted to a dtype"?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it's literally the dtype class. This should be relaxed to something more flexible, e.g., Union[Type[np.generic], str, dtype] (plus probably some other types for structured dtypes).

def __hex__(self) -> str: ...
if sys.version_info.major < 3:
def __oct__(self) -> str: ...
def __hex__(self) -> str: ...
def __nonzero__(self) -> bool: ...
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is called bool on py3

@@ -83,7 +109,8 @@ class dtype:
def type(self) -> builtins.type: ...


_dtype_class = dtype # for ndarray type
_DtypeLike = Union[dtype, _ConvertibleToDtype]
_Dtype = dtype # to avoid name conflicts with ndarray.dtype
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be great if we could have a "strict-typing" mode that narrowed the scope of what dtype-like or array-like means. I wonder how that could be achieved...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I'd love see this.

# are typed as Any.
# Refernce: https://docs.scipy.org/doc/numpy/reference/arrays.dtypes.html
_ConvertibleToDtype = Union[
type, # TODO: enumerate np.generic types and Python scalars
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does Type[np.generic] work here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would, but we haven't defined the scalar type hierarchy yet.

# definitions, any nested fields are required to be castable to a dtype object
# are typed as Any.
# Reference: https://docs.scipy.org/doc/numpy/reference/arrays.dtypes.html
_ConvertibleToDtype = Union[
Copy link
Member Author

@shoyer shoyer Mar 6, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apparently (with mypy) I can actually include a forward reference to dtype in this top-level type alias. If that's a standard feature for pyi files, then I suppose we should make use of it rather than defining the separate _DtypeLike alias below.

Copy link
Member Author

@shoyer shoyer Mar 6, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nevermind, this doesn't work. It does work.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just pushed a commit implementing this.

# TODO: add a protocol for anything with a dtype attribute
str,
Tuple[Any, int],
Tuple[Any, _Shape],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could combine these with a ShapeLike - which is consistent with how np.zeros(shape_like) and a bunch of other functions work.

List[Union[Tuple[Union[str, Tuple[str, str]], Any],
Tuple[Union[str, Tuple[str, str]], Any, _Shape]]],
Dict[str, Any],
Tuple[Any, Any]]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think each of these could do with a comment giving an example of the case they aim to cover

# (flexible_dtype, itemsize)
Tuple[Any, int],
# (fixed_dtype, shape)
Tuple[Any, _Shape],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that (int, 2) is shorthand for (int, (2,)), so this the int case covers more than you might expect

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, in that case I'm going to use _ShapeLike = Union[int, Tuple[int, ...]] for clarity.


from numpy.core._internal import _ctypes

_Shape = Tuple[int, ...]

# Anything that can be coerced into numpy.dtype. To avoid recursive
# definitions, any nested fields are required to be castable to a dtype object
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps do _DtypeLike_nested = Any # TODO: wait for support for recursive types, and then use it in place of Any below?

def __nonzero__(self) -> bool: ...
else:
def __bool__(self) -> bool: ...
def __bytes__(self) -> bytes: ...
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I imagine this is python 3 only?

Also __unicode__, __str__, and __repr__ are missing

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

# {'names': ..., 'formats': ..., 'offsets': ..., 'titles': ...,
# 'itemsize': ...} or {'field1': ..., 'field2': ..., ...}
# TODO: use TypedDict when/if it's officially supported
Dict[str, Any],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this really Any? I think there could be:

Union[
    Dict[str, Sequence[Union[int, str, _DtypeLike]]]   # for {"names": ..., "formats": ...}
    Dict[str, Union[Tuple[str, int], Tuple[_DtypeLike, int]]] # for {"field": ....}
]

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that should be


    # for {"names": ..., "formats": ...}
    Dict[str, Union[
        Sequence[str],  #names
        Sequence[_DtypeLike], # formats
        Sequence[int], # offsets
        int,  # itemsize
    ]],
    Dict[str, Tuple[_DtypeLike, int]] # for {"field": ....}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is probably dangerous because dict is an invariant type (so, for example, Dict[str, int] wouldn't be compatible with this big union type). Can you use Mapping instead?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could use Mapping, but it would be inaccurate. NumPy does actually insist on dict:

In [13]: od = collections.OrderedDict(x=(float, 1), y=(int, 2))

In [14]: np.dtype(od)
Out[14]: dtype({'names':['x','y'], 'formats':['<f8','<i8'], 'offsets':[1,2], 'itemsize':10})

In [15]: np.dtype(ChainMap([od]))
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-15-03baf5fab980> in <module>()
----> 1 np.dtype(ChainMap([od]))

TypeError: data type not understood

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have the same problem with:

   List[Union[
       Tuple[Union[str, Tuple[str, str]], _DtypeLikeNested],
       Tuple[Union[str, Tuple[str, str]], _DtypeLikeNested, _ShapeLike]]],

which maybe should be Sequence but NumPy insists on taking a list (or at least, it doesn't take a tuple which is my go-to example of a non-list sequence).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added the more specific types for Dict, which should at least minimize the invariance issue.

Is it possible to make a dict that is covariant in its argument type? e.g.,

K = TypeVar('V', covariant=True)
V = TypeVar('V', covariant=True)
def f(x: Dict[K, V]): ...

Copy link
Member

@eric-wieser eric-wieser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did this need changing? Why?

@eric-wieser
Copy link
Member

(re the text/Unicode commit)

@shoyer
Copy link
Member Author

shoyer commented Mar 8, 2018

Flake8 didn't like using unicode on Python 3. I guess I could have used a comment to disable the flake8 check but this was an easy work around.

def __abs__(self) -> ndarray: ...
def __invert__(self) -> ndarray: ...

# TODO(shoyer): remove when all methods are defined
def __getattr__(self, name) -> Any: ...
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing __setattr__ for .dtype and .shape?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dtype and shape are only typed as attributes, not properties, which means they can be set. But perhaps it would indeed be good to overload setters appropriately...

@shoyer shoyer merged commit 3074ca3 into numpy:master Mar 13, 2018
@shoyer shoyer deleted the special-methods branch March 13, 2018 15:41
@shoyer
Copy link
Member Author

shoyer commented Mar 13, 2018

@eric-wieser @alanhdu thanks for the careful reviews!

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants